Remember that one machine runs one kernel (which is itself a program), and all your servers compete for access to it. The sudden jump in user CPU comes when each game has to poll for exclusive access to the kernel to perform the same sort of operation. Your kernel cannot keep up. Certain lock/mutex operations poll rather than wait, because they don't expect to be held up, and that polling shows up as user CPU time. It happens suddenly because the kernel is either keeping up or it's not; once it can't keep up, a polling queue forms.
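
The difference between polling and waiting can be illustrated with a minimal Python sketch. This is only a user-space analogy under stated assumptions, not what the kernel or the game servers actually do: the helper name cpu_while_waiting and the 0.2 second hold time are hypothetical, chosen for the example. A waiter thread either retries a non-blocking acquire in a tight loop (polling) or blocks until the lock is released (waiting), and the sketch measures the CPU time the process uses in each case.

```python
import threading
import time


def cpu_while_waiting(spin: bool, hold_seconds: float = 0.2) -> float:
    """Return process CPU seconds used while a waiter thread waits for a held lock.

    Hypothetical helper, for illustration only.
    """
    lock = threading.Lock()

    def waiter() -> None:
        if spin:
            # Polling: retry a non-blocking acquire in a tight loop.
            while not lock.acquire(blocking=False):
                pass
        else:
            # Waiting: block inside acquire() until the lock is released.
            lock.acquire()
        lock.release()

    lock.acquire()  # the main thread holds the lock first
    thread = threading.Thread(target=waiter)
    start = time.process_time()  # CPU time of the whole process, all threads
    thread.start()
    time.sleep(hold_seconds)  # keep the waiter held up
    lock.release()
    thread.join()
    return time.process_time() - start


if __name__ == "__main__":
    print(f"polling CPU: {cpu_while_waiting(spin=True):.3f}s")
    print(f"waiting CPU: {cpu_while_waiting(spin=False):.3f}s")
```

The polling waiter should burn CPU for roughly as long as it is held up, while the blocking waiter should use next to none. Many processes polling at once is the kind of load the paragraph above describes.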
That contention is why faster CPUs are better than many slower ones: the kernel runs faster and can keep up with requests from more game processes.

You can prove this to yourself by running "strace -c -p <pid>" on one of the games when they are all lagging. Wait about 10 seconds, then press Ctrl-C to exit. This produces a summary of where the time is being spent. Run it on a game when everything is fine as well, for comparison. If the strace shows all the time going to kernel contention, you should look at running fewer servers on your box. It's the number of active servers, in conjunction with the number of slots, that gets you into that position: each individual game server is trying to access the kernel.

On UKCS we always go for the fastest latest-generation processors, have a maximum of 4 cores per kernel, and never run more than one game per core. This avoids any contention issues. But we're a public game group, not a game rental group, and I appreciate it's not cost-effective for you to go to that extreme. You could also look at your kernel config to make sure it is as near real-time as you can get it.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Morgan
Sent: 20 September 2008 23:08
To: hlds_linux@list.valvesoftware.com
Subject: [hlds_linux] Intel Quad-Core Xeon - problem with CPU load

Hello ppl,

I have a huge problem with the performance of my servers. I'm hosting around 30 game servers per machine, mostly CS 1.6, a few CS:S and CS:CZ. Whenever ~160-200 players are playing, CPU load jumps from 30% to ~70-80% (suddenly all processes use more CPU) and players get lag. It doesn't matter what kind of game servers are running: on one server I put only CS 1.6, on another I mixed CS 1.6 with CS:S, and both have the same problem. I tried 2 different data centers, so it's not a network problem. Maybe my system was set up improperly.

Configuration of servers:
2x Quad E5420, motherboard S5000PAL, 6x2GB FB-DIMM 667, 2x SAS with RAID 1 (I tried with 1 disk, no positive result).

System: Slackware 12.1 and CentOS 5.2
Kernel: 2.6.24.3, 2.6.25.4, 2.6.26.2 for Slackware and 2.6.18-92.el5PAE for CentOS (default)
Kernel processor settings I tested: BIG SMP/PC Compatible, 300/1000Hz, low-latency/server, preempt big kernel on/off, Core 2/newer Xeon

How it looks:
Players: http://img81.imageshack.us/my.php?image=playersrt1.jpg
CPU load: http://img221.imageshack.us/my.php?image=cpucr5.jpg

I don't know what I can do. Old Dual Xeons 3.0GHz with HT work fine with ~100. I think 2x Quad-Core Xeon should work fine with 300 active players. Don't you think so?

P.S. Sorry for my English ;-)

Regards,
J.Morgan

_______________________________________________
To unsubscribe, edit your list preferences, or view the list archives, please visit:
http://list.valvesoftware.com/mailman/listinfo/hlds_linux