Joe Schaefer writes:
> > experience, the only way to build large scale systems is with
> > stateless, single-threaded servers.
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Could you say some more about what you mean by this?  Do you mean
> something like
>
>     "use a functional language (like Haskell or Scheme), rather
>      than an imperative language (like C, Java, Perl ...)",
Not exactly, but this is an interesting topic.

> or are you talking more about the application's platform and design
> (e.g. http://www.kegel.com/c10k.html )?

That article addresses path length, which is the "single-threaded"
part.  Scalability is not addressed.  Both parts are important to
understand when you build enterprise systems.

I changed the subject to Neo-Classical Transaction Processing, which
is the way I look at web applications.  If you'll bear with me, I can
explain the Neo and Classical parts with a picture.  Here's a
classical transaction processing system:

 T         +----------------+       +--------------+        ______
 e         |                |       |              |+      /      \
 r  -------|                |-------|              ||     |        |
 m         |  Transaction   |       |   Custom     ||-----|        |
 i  -------|    Monitor     |-------|   Servers    ||     |   DB   |
 n         |                |       |              ||-----|        |
 a  -------|                |-------|              ||     |        |
 l         |                |       +--------------+|      \______/
 s         +----------------+        +--------------+

Now here's a typical (large) Apache/mod_perl setup (Neo-Classical):

 B         +----------------+       +--------------+        ______
 r         |                |       |              |+      /      \
 o  -------|                |-------|              ||     |        |
 w         |     Apache     |       |    Apache    ||-----|        |
 s  -------|    mod_proxy   |-------|   mod_perl   ||     |   DB   |
 e         |                |       |              ||-----|        |
 r  -------|                |-------|              ||     |        |
 s         |                |       +--------------+|      \______/
           +----------------+        +--------------+

The browsers are connected to a fast IP router (or several), the
equivalent of yesteryear's I/O processor.  The mod_proxy servers are
simple switches, just like the TM.  (Unlike the TM, the front-ends
don't manage the transactions.)  It's usually a given that the
front-ends are stateless.  Their job is dynamic routing for load
balancing and reliability.  They also serve icons and other stateless
files.  If a front-end crashes, the IP router ignores it and goes to
another front-end.  No harm done.  The IP router also balances the
load, something that isn't provided by classical I/O processors.

The mod_perl servers are the work horses, just like the custom
servers.  In a classical OLTP system, the custom servers are
stateless, that is, if a server goes down, the TM/mod_proxy server
routes around it.  (The TM rolls back any transactions and restarts
the entire request, which is interesting but irrelevant for this
discussion.)  If the work servers are fully loaded, you simply add
more hardware.  If all the servers are stateless, the system scales
linearly, i.e. the number of servers is directly proportional to the
number of users that can be served.  That's the stateless part.

Threading is the other issue: should the servers (mod_proxy or
mod_perl) be threaded?  In classical OLTP systems, the work servers
are single threaded (as in one request at a time) and the TM handles
multiple simultaneous requests, but isn't "multi-threaded" (in the
Java sense).

The work server can be thought of as a resource unit.  Usually it
represents a fair bit of code and takes up a chunk of memory.  It can
only process so many requests per unit time.  If the work server is
multi-threaded, it is harder to manage resources and configure for
peak load.  In the single-threaded model, each work server (process)
is a reservation for the resources it needs for one request.  In a
multi-threaded model, the resource reservations are less clear.  It
might have a shared database connection pool, or it might have two
simultaneous requests which need more memory.  The meaning of capacity
becomes fuzzy.

If the whole multi-threaded server ever has to wait on a single shared
resource (e.g. the garbage collector), all simultaneous requests in
progress lose.  This doesn't happen if the work servers are single
threaded, i.e. "shared nothing".  You can easily model the resources
required to process a request, and no request has to wait on another,
except to get CPU.
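To make the single-threaded idea concrete, here's a toy "shared
nothing" work server in plain Perl.  It's only a sketch (the port
number and the handle() routine are made up, and a real mod_perl child
gets its request loop from Apache), but it shows the resource model:
one process, one request, nothing shared.

    #!/usr/bin/perl -w
    # Toy "shared nothing" work server: one process, one request at a
    # time.  Port 8080 and handle() are made up for illustration.
    use strict;
    use IO::Socket::INET;

    my $server = IO::Socket::INET->new(
        LocalPort => 8080,
        Listen    => 5,
        Reuse     => 1,
    ) or die "listen: $!";

    while (my $client = $server->accept) {
        # Everything this request needs (memory, DB handle, CPU)
        # belongs to this process alone until the response is written.
        my $request = <$client>;     # just the request line; enough for a sketch
        my $response = handle($request);
        print $client $response;
        close($client);
        # If this process dies, the front-end routes around it; no
        # other request is affected, because nothing is shared.
    }

    sub handle {
        my($request) = @_;
        # the real work (and the real resource consumption) goes here
        return "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nok\n";
    }

Capacity planning then reduces to counting processes: if one of these
can do N requests per second, ten of them can do roughly 10*N, which
is the linear scaling mentioned above.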
The front-end servers are different.  They do no work.  Their job is
switching requests/responses.  Some requests go directly to the OS
(e.g. get an icon) and others go to the work servers.  The front-end
servers "wait" most of the time.  Here I buy into the asynchronous I/O
model of serving multiple requests, not the lightweight process model.
The asynch I/O model (variations of which are discussed in the c10k
paper) is a way to achieve asynchrony much more efficiently than
threads.  Instead of reserving a stack and some OS space for each
client, you reserve a block of memory in a single-threaded process.
The OS is not involved.  The main loop evaluates which I/O ports are
ready and processes them serially.

With preemptable threads, as in Java, the processor is used
inefficiently, because there's an unnecessary and unpredictable
process-switching cost.  The asynch I/O model does not incur this
overhead, which can be significant relative to the cost of processing
each I/O.  Preemptable threading is an "easy" solution to the
asynchrony problem, but it is resource intensive.

What's scary about modern distributed application servers is that they
hide the asynchrony problem.  This works fine until the system must
scale to hundreds or thousands of users.  It becomes very hard to
determine how the server will behave at and above peak load.
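In case it helps, here's a toy version of the front-end loop I
described above, in plain Perl.  It's only a sketch (it just echoes,
and the port and buffer size are arbitrary); mod_proxy and the c10k
servers are far more involved.  The point is the shape: one process, a
block of memory per client, and a serial loop over whatever select()
says is ready.

    #!/usr/bin/perl -w
    # Toy single-threaded front-end: one process watches many
    # connections and handles whatever is ready, serially.
    use strict;
    use IO::Socket::INET;
    use IO::Select;

    my $listener = IO::Socket::INET->new(
        LocalPort => 8000,
        Listen    => 128,
        Reuse     => 1,
    ) or die "listen: $!";

    my $select = IO::Select->new($listener);
    my %buf;    # the per-client "block of memory"; no thread, no stack

    while (1) {
        # Ask the OS which handles are ready, then handle them one by
        # one.  No locks, no preemption; a request only waits for CPU.
        for my $fh ($select->can_read) {
            if ($fh == $listener) {
                my $client = $listener->accept or next;
                $select->add($client);
                $buf{fileno($client)} = '';
            }
            else {
                my $chunk;
                if (!sysread($fh, $chunk, 4096)) {
                    # EOF or error: forget this client and move on
                    $select->remove($fh);
                    delete $buf{fileno($fh)};
                    close($fh);
                    next;
                }
                # A real front-end would route the request to a
                # back-end or serve a static file; this one echoes.
                $buf{fileno($fh)} .= $chunk;
                syswrite($fh, $buf{fileno($fh)});
                $buf{fileno($fh)} = '';
            }
        }
    }

Rob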