At 10:49 AM 1/26/2008, Vince W. wrote:
We see the same thing.  Intel 5310 1.6GHz quad-core Xeon, 8MB cache, 4GB
RAM.  Running CentOS 5.1 x86 with a couple of different flavors of custom
kernels compiled for maximum performance with these CPUs, and
glibc-2.5-18.el5_1.1.

(Note: we also saw the same issues on a 3GHz dual-core box running single
instances.)

We run 28-slot servers, currently 3 of them, on the box.  We have seen
CPU utilization above 90% on one server instance when it is full and
at a "busy" point, even with the other 2 instances empty, with
performance of the game as described below.  We only run 1 plugin
(beetlesmod), though performance is the same without it.

Up to this point I have not attempted to use taskset to "lock" processes
to individual CPU cores, because I am not sure what its effect would be if
there is any multi-core utilization going on in the server.  I found
some discussion of this the other day when I discovered the
host_thread_mode cvar, but then learned from googling that it
doesn't do anything for dedicated servers - yet some(...) multi-core
utilization is already in the dedicated server binaries.  So it doesn't
seem to make sense to taskset the game to only 1 core, unless I'm just
not understanding.  I'm also thinking that I could probably keep the
benefit of running the gameserver processes on specific
CPU cores, and at the same time accommodate any multicore processing
going on in the dedicated server, by using taskset to bind each process
to 2 specific cores (and "mixing it up" a bit with which cores each
process gets taskset to).
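For what it's worth, if you do decide to experiment, taskset can also change the affinity of an already-running process by PID, so you can A/B test 2-core bindings on a live server without restarting it.  A sketch (the PID below is hypothetical; the first command just shows the current shell's own allowed cores):

```shell
# Which cores is this process currently allowed to run on?
grep Cpus_allowed_list /proc/self/status

# Bind a running srcds instance (hypothetical PID 12345) to cores 0 and 1,
# then read the binding back to confirm:
#   taskset -cp 0,1 12345
#   taskset -cp 12345
```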

I may just be over-analyzing all of this with a less-than-perfect
understanding of any of it.

But what I do know is that TF2 uses a LOT more CPU on Linux than CS:S does
to handle the same level of activity / number of busy player slots - at
least twice as much, from what I have seen.  I can't help wondering
whether there is a lot more optimization that could be done in the code,
or in how it is compiled, so that it runs more efficiently on Linux.

The server drops to low double-digit server fps when things get really
busy - for example, heavy firefights, or the end of stage 2 of Dustbowl
when everybody is shooting, all engies are banging on their gear, etc. -
and lags noticeably.  The incoming updates from the server sometimes dip
as low as 15 per second, and the stats output shows as low as 20 server
fps.  We had to reduce the slots from 32 to 28 to get this more under
control.  We also use Beetlesmod's (small|medium|large)maps.cfg files to
vary the maximum ping allowed based on the number of players on the
server, so when the server is full, 300ms is the highest ping we allow
before the player gets kicked.

REALTIME/TICKLESS KERNEL
I've tried 2.6.24-rc5-rt -- tickless kernel, hires timers, compiled with
every compiler-flag optimization the CPU family supports, kernel IRQ
balancing enabled (and the userspace irqbalance daemon disabled), etc.
The server will hover around 935+ server fps on this kernel under low
"in-game demand" (though I normally set fps_max much lower - 200), but it
can't sustain that when things get busy in the game.  Based on top output,
the CPU load of each server instance seems to be spread across all 4 CPU
cores, but I'm not so sure this is a good thing (if we really are doing
that much task migration, it is going to introduce overhead).
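Incidentally, a lighter-weight way than watching top to see whether a process really is migrating between cores is to sample the PSR column from ps (the core the task last ran on).  A sketch, using the shell's own PID as a stand-in for a srcds process:

```shell
# PSR is the processor the task last ran on; if it keeps changing between
# samples, the scheduler is migrating the process across cores
for i in 1 2 3; do
    ps -o pid=,psr=,comm= -p $$
    sleep 1
done
```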

For me, the problem with the realtime kernel is that I keep seeing
"Warning: System clock went backwards 1 seconds" messages come through
in bursts in the game console periodically, and when this happens, the
server lags for 3-5 seconds - which really annoys the connected players
(and me).  I've tried both the hpet and acpi_pm clocksources (the server
has both), but I get the same problem with either.  This may be something
that needs to be addressed in the kernel timers code when the realtime
patches are in use, or maybe even an indication of jittery timers on the
hardware (but I don't think so).  Or maybe it's just how the gameserver
reads the time.  But overall, I haven't seen the game run noticeably
better on this kernel, regardless of how I tune the realtime priority
(with chrt) or nice levels.

This is an issue with RT kernels: timers are supposed to increase
monotonically on each read.  HPET is your best bet for anything;
ACPI is a little slower to read.
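On kernels new enough to expose the generic clocksource interface in sysfs (the 2.6.24-rc RT kernel above should be), you can check and switch the active clocksource at runtime, which makes comparing hpet and acpi_pm painless.  A sketch:

```shell
CS=/sys/devices/system/clocksource/clocksource0
cat "$CS/available_clocksource"   # e.g. "tsc hpet acpi_pm"
cat "$CS/current_clocksource"
# Switch to hpet at runtime (needs root; lasts until reboot):
#   echo hpet > "$CS/current_clocksource"
```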

I also took the stock CentOS 5.1 (2.6.18-whatever) .src.rpm, went in and
set the various performance-related settings available (CONFIG_HZ to
1000, the highest compatible CPU type available in menuconfig, full
preemption, etc.), and compiled it using all the compiler flags the CPU
supports (core2duo march/mtune, -msse3, -mssse3, -mfpmath=sse, etc.).
Under this kernel I see no "clock went backwards" messages (and therefore
we don't get the big periodic lag spikes), and with fps_max at 200, the
server fps hovers around 166 under no/light load.  The gameserver
processes seem to "stick" to particular cores more consistently (the CPU
utilization top reports for the process directly impacts 1 core and
barely touches the other 3).  But we still see the same CPU utilization
(approaching, and sometimes exceeding, 90%) when the game gets really
busy, and the same issues with updates and server fps dropping into the
mid-to-low double digits.
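A quick sanity check after this kind of rebuild is to confirm the options actually landed in the running kernel; the build config is usually installed alongside it (paths vary by distro).  A sketch:

```shell
# CentOS-style kernels drop their build config under /boot;
# some kernels export it at /proc/config.gz instead
CFG=/boot/config-$(uname -r)
if [ -r "$CFG" ]; then
    grep -E '^CONFIG_(HZ|PREEMPT)' "$CFG"
elif [ -r /proc/config.gz ]; then
    zcat /proc/config.gz | grep -E '^CONFIG_(HZ|PREEMPT)'
else
    echo "kernel config not exported on this system"
fi
```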

Most of the issues you are having come from trying to service high FPS
on a large server: you're burning up a lot of CPU time trying to return
exactly the requested intervals.  Also, you're trying patches that do
more harm than good for what you're trying to accomplish.

If you're burning up a lot of CPU time on things like gettimeofday(),
I suggest you try running x86_64, since gettimeofday won't be as
expensive to read there, thanks to vsyscalls.  Stay away from the RT
patches, and you really don't need preemption (at all).
Compiler flags don't really affect anything other than microbenchmarks.
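Before going down that road, it's worth confirming the box is actually running a 64-bit kernel, and remembering that the server binary itself must be a 64-bit build to use the cheap time-reading path.  A sketch (the binary path is hypothetical):

```shell
uname -m   # "x86_64" means a 64-bit kernel, where gettimeofday() can be
           # served without the cost of a full syscall
# The game binary must be 64-bit too; check with file(1), e.g.:
#   file ./srcds_binary    (hypothetical path; look for "ELF 64-bit")
```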


_______________________________________________
To unsubscribe, edit your list preferences, or view the list archives, please 
visit:
http://list.valvesoftware.com/mailman/listinfo/hlds_linux
