> Does that answer your questions? 

Thanks, Ben, for taking the time to write. I found your response (and 
Matt's too) both interesting and enlightening.


> > Node.js uses non-blocking I/O when it can and only falls back to a 
> > thread pool when it must. 

So, besides file-system I/O, what would be other examples of Node employing 
the worker thread pool? It seems that knowing this would be one of the 
essentials of writing a scalable Node app.
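
A partial answer, going by the Node.js documentation on the 
UV_THREADPOOL_SIZE environment variable: besides the fs APIs, the 
asynchronous crypto APIs (such as crypto.pbkdf2()), dns.lookup() and the 
asynchronous zlib APIs are listed as users of the thread pool. A minimal 
sketch, assuming nothing beyond Node's built-in modules (the function name 
threadPoolExamples is made up for illustration):

```javascript
const crypto = require('crypto');
const zlib = require('zlib');
const { promisify } = require('util');

// Callback-style APIs wrapped as promises. Both hand their work to the
// libuv thread pool rather than doing it on the event-loop thread.
const pbkdf2 = promisify(crypto.pbkdf2);
const deflate = promisify(zlib.deflate);
const inflate = promisify(zlib.inflate);

// Hypothetical helper, for illustration only.
async function threadPoolExamples() {
  // Key derivation: CPU-heavy, runs on a pool thread.
  const key = await pbkdf2('secret', 'salt', 1000, 32, 'sha256');

  // Compression and decompression: also run on pool threads.
  const packed = await deflate(Buffer.from('hello node'));
  const unpacked = await inflate(packed);

  return { keyLength: key.length, text: unpacked.toString() };
}

threadPoolExamples().then((result) => console.log(result));
```

The pool defaults to 4 threads; sockets and pipes do not go through it at 
all, which matches the quoted statement above.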


> > 3. If the RDBMS instance (say, MySQL) is co-located on the Node server 
> > box, then would it be correct 
>
> Communication with the database normally takes place over a TCP or 
> UNIX socket, 

I see that you assume (or recommend) a specific setup for web apps, i.e. 
one in which the DBMS server runs on a dedicated box of its own. This, 
incidentally, is typically the case even with apps far below C10K scale 
(<< C10K). So, Node's evented nature won't magically obviate the need for a 
dedicated DBMS server. (Note that I've never really developed or deployed 
even a modest-scale web app, let alone a C10K-scalable one, hence all this 
newbie ignorance.)

On Thursday, March 24, 2016 at 7:52:05 AM UTC+5:30, Ben Noordhuis wrote:
>
> Hello Harry, replies inline. 
>
> On Wed, Mar 23, 2016 at 5:25 PM, Harry Simons <[email protected]> wrote: 
> > Hello, 
> > 
> > 
> > I have not been able to see the following points addressed in all the 
> > online material I have read to date on Node, and so, hope to be 
> > enlightened by some very smart and knowledgeable folks here that I 
> > presume would be reading this. 
> > 
> > 
> > 1. Since I/O happens asynchronously in worker threads, it is possible 
> > for a single Node process to quickly/efficiently accept 1000s of 
> > incoming requests compared to something like Apache. But, surely, the 
> > outgoing responses for each of those requests will take their own time, 
> > won't they? For example, if an isolated and primarily I/O-bound request 
> > takes, say, 3 seconds to get serviced (with no other load on the 
> > system), then if concurrently hit with 5000 such requests, won't Node 
> > take a lot of time to service them all, fully? If this 3-second task 
> > happens to involve exclusive access to the disk, then it would take 
> > 5000 x 3 sec = 15000 seconds, or over 4 hours of wait to see the 
> > response for the last request coming out of the Node app. In such 
> > scenarios, would it be correct to claim that a single-process Node 
> > configuration can 'handle' 1000s of requests per second (granted, a 
> > threaded server like Apache would do a lot worse with 5000 threads) 
> > when all that Node may be doing is simply putting the requests 'on 
> > hold' till they get fully serviced instead of rejecting them outright 
> > on their initial arrival itself? I'm asking this because as I'm reading 
> > up on Node, I'm often hearing how Node can address the C10K problem 
> > without any co-mention of any specific application setups or any 
> > specific application types that Node can or cannot handle... other than 
> > the broad, CPU- vs I/O-bound type of application classification. 
>
> Node.js uses non-blocking I/O when it can and only falls back to a 
> thread pool when it must.  Sockets, pipes, etc. are handled in 
> asynchronous, non-blocking fashion using native system APIs but e.g. 
> file I/O is offloaded to a thread pool. 
>
> > 2. What about the context switching overhead of the workers in the 
> > worker-thread pool? If C10K requests hit a Node-based application, 
> > won't the workers in the worker-thread pool end up context-switching 
> > just as much as the user threads in the thread pool of a regular, 
> > threaded server (like Apache)? After all, all that would have happened 
> > in Node's event thread would be quick request parsing and request 
> > routing, with the remainder (or, the bulk) of the processing still 
> > happening in the worker thread. That is, does it really matter (as far 
> > as minimization of thread context-switching is concerned) whether a 
> > request/response is handled from start to finish in a single thread (in 
> > the manner of a threaded server like Apache), or whether it happens 
> > transparently in a Node-managed worker thread with only minimal work 
> > (of request parsing and routing) subtracted from it? Ignore here the 
> > simpler, single-threaded user model of coding that comes with an 
> > evented server like Node. 
>
> See above.  Depending on the application, you may not hit the thread 
> pool much or at all. 
>
> > 3. If the RDBMS instance (say, MySQL) is co-located on the Node server 
> > box, then would it be correct to classify a database CRUD operation as 
> > a pure I/O task? My understanding is, a CRUD operation on a large, 
> > relational database will typically involve heavy-duty CPU- and 
> > I/O-processing, and not just I/O-processing. However, the online 
> > material that I've been reading seems to label a 'database call' as 
> > merely an 'I/O call' which supposedly makes your application an 
> > I/O-bound application if that is the only thing your application is 
> > (mostly) doing. 
>
> Communication with the database normally takes place over a TCP or 
> UNIX socket, so as far as node.js is concerned, it's not much 
> different from any other network connection.  The heavy-duty number 
> crunching takes place in a different process, the RDBMS. 
>
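
Ben's point here can be illustrated with a small sketch. The 'database' 
below is only a stand-in TCP server living in the same script (a real RDBMS 
would be a separate process, and a real app would use a driver library); the 
names startFakeDatabase and runQuery, and the one-line text protocol, are 
invented for illustration. What it tries to show is that the Node side does 
nothing but write to and read from a socket:

```javascript
const net = require('net');

// Stand-in for the RDBMS: replies to one line of text per connection.
// (Hypothetical; a real database does its heavy lifting in its own process.)
function startFakeDatabase() {
  return new Promise((resolve) => {
    const server = net.createServer((socket) => {
      socket.once('data', (query) => {
        // Assumes the short query arrives in a single chunk.
        socket.end(`result of: ${query.toString().trim()}\n`);
      });
    });
    // Port 0 asks the OS for any free port.
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}

// The Node application's side: plain non-blocking socket I/O.
function runQuery(port, sql) {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection({ port, host: '127.0.0.1' });
    let reply = '';
    socket.setEncoding('utf8');
    socket.on('connect', () => socket.write(`${sql}\n`));
    socket.on('data', (chunk) => { reply += chunk; });
    socket.on('end', () => resolve(reply.trim()));
    socket.on('error', reject);
  });
}

async function demo() {
  const server = await startFakeDatabase();
  try {
    return await runQuery(server.address().port, 'SELECT 1');
  } finally {
    server.close();
  }
}

demo().then((reply) => console.log(reply));
```

While the reply is pending, the event loop is free to serve other requests; 
no thread-pool worker is tied up waiting on the socket.
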
> > 4. A final question (related to the above themes) that may require 
> > knowledge of modern hardware and OS which I am not fully up-to-date on. 
> > Can I/O (on a given I/O device) be done in parallel, or even 
> > concurrently if not in parallel, and THUS, scale proportionally with 
> > user-count? Example: Suppose I have written a file-serving Node app 
> > that serves files from the local hard-disk, making it strongly an 
> > I/O-bound app. If hit with N (~ C10K) concurrent file-serving requests, 
> > what I think would happen is this: 
> > 
> > - The event-loop would spawn N async file-read requests, and go idle. 
> >   (Alternatively, Node would have pre-spawned all its workers in the 
> >   pool on process startup.) 
> > - The N async file-read requests would each get assigned to a worker 
> >   thread in the worker thread pool. If N > pool size, then the balance 
> >   will be made to wait in some sort of an internal queue to get 
> >   assigned to a worker. 
> > - Each file-read request would run concurrently at best, or 
> >   sequentially at worst, but definitely NOT N-way in parallel. That is, 
> >   even a RAID configuration would be able to service merely a handful 
> >   of file-read requests in parallel, and certainly not all N in 
> >   parallel. 
> > - This would result in a large, total wait-time for the last 
> >   file-serving request to be served fully. 
>
> That's by and large correct. 
>
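
To put rough numbers on the queueing described in the list above, here is a 
back-of-the-envelope sketch. It is a pure simulation, not a measurement: 
finishTimes is a made-up helper, every job is assumed to take exactly 3 
seconds, and the pool size of 4 is libuv's default (which also assumes, 
optimistically, that the disk can really serve 4 reads in parallel).

```javascript
// Hypothetical helper: returns the finish time (in seconds) of each of
// `requestCount` equal-length jobs when at most `poolSize` run at once
// and the rest wait in a FIFO queue.
function finishTimes(requestCount, poolSize, secondsPerJob) {
  const workerFreeAt = new Array(poolSize).fill(0);
  const finished = [];
  for (let i = 0; i < requestCount; i++) {
    // The next queued job goes to whichever worker frees up first.
    let w = 0;
    for (let j = 1; j < poolSize; j++) {
      if (workerFreeAt[j] < workerFreeAt[w]) w = j;
    }
    workerFreeAt[w] += secondsPerJob;
    finished.push(workerFreeAt[w]);
  }
  return finished;
}

// Exclusive disk access behaves like a pool of one.
const exclusiveDisk = finishTimes(5000, 1, 3);
// Default-sized pool, assuming 4 truly parallel reads.
const defaultPool = finishTimes(5000, 4, 3);

console.log(exclusiveDisk[exclusiveDisk.length - 1]); // 15000
console.log(defaultPool[defaultPool.length - 1]);     // 3750
console.log(exclusiveDisk[0], defaultPool[0]);        // 3 3
```

The first request is answered after 3 seconds either way; it is the tail of 
the queue that waits, and a bigger pool only helps as far as the device 
underneath can actually work in parallel.
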
> > So, if all these 4 points are true, how could we really say that a 
> > single Node-process based application is good for I/O-bound 
> > applications (because it scales well)? Can the mere ability to receive 
> > a large number of incoming requests and keep them all on hold 
> > indefinitely while their I/O fully completes (versus rejecting them 
> > outright on their initial arrival itself) be called 'servicing the 
> > requests'? Can such an application be seen as scaling well with respect 
> > to user-count? 
>
> For many applications the answer is 'yes' because they can break up 
> the request in smaller parts that they can service independently. 
>
> Say your application has to read a) a file from disk, b) query a 
> database, and c) consult a web service before it can send a reply.  In 
> the traditional web server model, it takes a+b+c time, whereas with 
> the asynchronous model it's max(a,b,c). 
>
> max(a,b,c) <= a+b+c so worst case, it performs the same, but common 
> case, it's much faster. 
>
> Does that answer your questions? 
>
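
The a+b+c versus max(a,b,c) point can be sketched with timers standing in 
for the three operations. The 50/100/150 ms durations and the helper names 
are arbitrary assumptions, and the sketch presumes the three steps do not 
depend on each other's results:

```javascript
// Stand-in for an I/O operation that takes `ms` milliseconds.
const fakeIo = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const steps = [
  () => fakeIo(50),   // a) read a file
  () => fakeIo(100),  // b) query a database
  () => fakeIo(150),  // c) consult a web service
];

// One after the other: roughly a + b + c.
async function sequentialMs() {
  const start = Date.now();
  for (const step of steps) await step();
  return Date.now() - start;
}

// All started at once: roughly max(a, b, c).
async function concurrentMs() {
  const start = Date.now();
  await Promise.all(steps.map((step) => step()));
  return Date.now() - start;
}

async function compare() {
  const sequential = await sequentialMs();
  const concurrent = await concurrentMs();
  return { sequential, concurrent };
}

compare().then((timings) => console.log(timings));
```

On an idle machine the first figure should come out near 300 ms and the 
second near 150 ms; exact values vary with timer resolution.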
