Hey pluggers,
I have a couple of questions about the Linux cluster computing software, Beowulf (I think that's what it's called). Does anyone know if there's a maximum number of nodes a cluster can support? And can the cluster master add new nodes on the fly, so to speak?

My understanding of cluster computing is that you send a job to the master server, and the master in turn splits the job up among all the nodes in the cluster, then combines the results. I think it's not all that dissimilar to the way a multi-processor machine divides work among its CPUs. Is this correct? (I've pasted some rough sketches of what I mean at the bottom of this mail, before my signature.)

Here's a scenario I could envision; please tell me if there's a serious flaw in the idea. I hope it explains what I'm thinking of. Program X is intended to handle quite literally hundreds of millions of simultaneous TCP connections across a number of network interfaces (the exact number is unimportant). As program X runs, the overall load rises, because X has to do more and more work for each connection. That much is a given. Take, for instance, an MMORPG: the more people who log in to the game and interact with the world, the more work there is for the actual server. Same idea here.

My thought was that if X were built to support clustering and run on a Beowulf cluster, that alone would slow the initial growth of the load. Then, supposing dynamic addition is possible, I could watch the system, and if the load average on the cluster got too high (from overwhelming the CPUs, not the internet pipe), someone could bring it back down, without end users noticing anything, simply by adding new machines (the second and third sketches below). Is this right, or is there a major flaw in my thinking? And if Beowulf isn't the software I'm thinking of, what is?

What other issues, besides perhaps memory usage, would be involved? And speaking of memory, is there a way to distribute the memory at all? I know 8 GB DIMMs are easy to find these days, and a high-end motherboard can usually take 8 DIMMs, so that's 64 GB of RAM per box. But with hundreds of millions of connections, I could see exceeding memory capacity becoming a real issue. I suppose I could hunt down bigger DIMMs (I've heard rumors of 16 GB and even 32 GB DIMMs, but never seen them), but that's going to seriously jack up the server costs. So if there's a way to distribute the memory usage along with the CPU work, that would be of great assistance (the last sketch below).
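To make my mental model concrete, here's roughly what I picture the split/combine step looking like. I'm assuming MPI is the message-passing layer a Beowulf cluster would typically run; the MPI calls are real, but the toy workload (summing integers) is obviously just a stand-in. Compile with mpicc, launch with mpirun:

    /* Toy scatter/gather: the master splits a job among the nodes,
     * each node does its share, and the master combines the results. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    enum { PER_NODE = 4 };  /* items each node handles */

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* who am I?       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many of us? */

        int *all = NULL;
        if (rank == 0) {  /* rank 0 plays the "master" */
            all = malloc(PER_NODE * size * sizeof *all);
            for (int i = 0; i < PER_NODE * size; i++)
                all[i] = i;
        }

        /* the master hands each node its slice of the job... */
        int mine[PER_NODE];
        MPI_Scatter(all, PER_NODE, MPI_INT, mine, PER_NODE, MPI_INT,
                    0, MPI_COMM_WORLD);

        int partial = 0;
        for (int i = 0; i < PER_NODE; i++)
            partial += mine[i];

        /* ...then combines the partial results back into one */
        int total = 0;
        MPI_Reduce(&partial, &total, 1, MPI_INT, MPI_SUM,
                   0, MPI_COMM_WORLD);

        if (rank == 0) {
            printf("combined result: %d\n", total);
            free(all);
        }
        MPI_Finalize();
        return 0;
    }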
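On the "adding nodes on the fly" question: I gather MPI-2 added MPI_Comm_spawn for starting new processes at runtime, though I don't know whether that extends cleanly to machines that weren't part of the cluster at launch, which is really what I'm asking about. A self-spawning toy, assuming an MPI-2 implementation:

    /* Toy dynamic growth: the original process spawns two more
     * copies of itself at runtime via MPI_Comm_spawn (MPI-2). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        MPI_Comm parent;
        MPI_Comm_get_parent(&parent);  /* were we spawned? */

        if (parent == MPI_COMM_NULL) {
            /* original process: ask the runtime for two more workers */
            MPI_Comm kids;
            MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                           0, MPI_COMM_SELF, &kids, MPI_ERRCODES_IGNORE);
            printf("spawned two extra workers at runtime\n");
        } else {
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            printf("spawned worker %d reporting in\n", rank);
        }

        MPI_Finalize();
        return 0;
    }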
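The watcher half seems like the easy part, since on Linux the load averages are just sitting in /proc/loadavg. A minimal sketch (the threshold is a made-up number you'd tune per machine):

    /* Minimal load watcher: read /proc/loadavg and complain when
     * the 5-minute average crosses a (made-up) threshold. */
    #include <stdio.h>

    int main(void)
    {
        double one, five, fifteen;
        FILE *f = fopen("/proc/loadavg", "r");
        if (!f) { perror("/proc/loadavg"); return 1; }
        if (fscanf(f, "%lf %lf %lf", &one, &five, &fifteen) != 3) {
            fclose(f);
            return 1;
        }
        fclose(f);

        const double threshold = 8.0;  /* tune per machine */
        if (five > threshold)
            printf("load %.2f over %.2f -- time to add a node?\n",
                   five, threshold);
        return 0;
    }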
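And on the memory side, what I'm imagining is each machine owning a slice of the per-connection state, with a hash deciding which machine owns which connection. Something like the following, where the node names are invented and the hash is FNV-1a (a real, well-known one). I realize a plain modulo reshuffles nearly everything when the node count changes, which I assume is why people talk about consistent hashing instead:

    /* Sketch of distributing per-connection state: hash the
     * connection ID to pick the machine that owns its memory. */
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t fnv1a(uint64_t key)
    {
        uint64_t h = 1469598103934665603ULL;   /* FNV offset basis */
        for (int i = 0; i < 8; i++) {
            h ^= (key >> (i * 8)) & 0xff;
            h *= 1099511628211ULL;             /* FNV prime */
        }
        return h;
    }

    int main(void)
    {
        const char *nodes[] = { "node0", "node1", "node2", "node3" };
        const int nnodes = sizeof nodes / sizeof nodes[0];

        for (uint64_t conn = 1000; conn < 1005; conn++) {
            int owner = (int)(fnv1a(conn) % (uint64_t)nnodes);
            printf("connection %llu lives on %s\n",
                   (unsigned long long)conn, nodes[owner]);
        }
        return 0;
    }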
Thanks in advance, folks!

--- Dan

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/