> A better solution would be to just completely profile the engine
> to see what it's spending its time doing.
>

Done that - it's very call-heavy: 8%+ of the time in bitbuffers, something
like 4% in thunks.

> The engine also spends HZ/sec reading the clock. Do you really need
> to call the clock that many times? Solution:
> cache the last timestamp and update it to the engine only when
> necessary. Or use rdtsc or open /dev/hpet and read it. (the former is
> the better choice; most newer systems have TSCs that don't require a
> cpuid instruction)
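
For reference, that caching suggestion would look roughly like this (a
hypothetical sketch; Engine_FloatTime and UPDATE_INTERVAL are names I made
up, not actual engine code):

    /* Re-read the clock only every Nth call and serve the cached
     * value in between. Hypothetical sketch, not engine code. */
    #include <sys/time.h>

    #define UPDATE_INTERVAL 64   /* calls between real clock reads */

    static struct timeval cached_tv;
    static unsigned int   calls;

    double Engine_FloatTime(void)
    {
        if (calls++ % UPDATE_INTERVAL == 0)
            gettimeofday(&cached_tv, NULL);
        return cached_tv.tv_sec + cached_tv.tv_usec * 1e-6;
    }

    /* The rdtsc alternative mentioned above, as a GCC inline (x86): */
    static inline unsigned long long read_tsc(void)
    {
        unsigned int lo, hi;
        __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
        return ((unsigned long long)hi << 32) | lo;
    }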


I used to think this was a big issue too, but in my various attempts to hack
srcds into running better, it became apparent that this is more of a red
herring. On an up-to-date distro with a recent kernel/glibc, the gettimeofday
call, context switch and all, takes something like 600 ns on a Core 2-based
CPU. The context switch might hurt cache efficiency, but I believe that's a
red herring as well. Interestingly,
clock_gettime(CLOCK_THREAD_CPUTIME_ID) takes around 800 ns, even though it
has no context switch and is implemented with RDTSC on supported platforms.
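
For anyone who wants to check those numbers on their own box, a
quick-and-dirty benchmark along these lines does it (a rough sketch, not the
exact code I ran):

    /* Compile: gcc -O2 -o clockbench clockbench.c (add -lrt on old glibc) */
    #include <stdio.h>
    #include <time.h>
    #include <sys/time.h>

    #define ITERS 1000000

    static double elapsed_ns(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
    }

    int main(void)
    {
        struct timespec t0, t1, ts;
        struct timeval tv;
        int i;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < ITERS; i++)
            gettimeofday(&tv, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("gettimeofday: %.0f ns/call\n", elapsed_ns(t0, t1) / ITERS);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < ITERS; i++)
            clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("clock_gettime(THREAD_CPUTIME): %.0f ns/call\n",
               elapsed_ns(t0, t1) / ITERS);
        return 0;
    }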

They are still using usleep()s, which is only good down to 10^-4 or
> 10^-5 seconds.. they should just use nanosleep and avoid a couple of
> extra paths (not really an optimization, just saves a couple of
> steps that glibc does). But extra accuracy increases cpu overhead :P


I have a plugin that redirects the sleep in the frame loop to nanosleep and
lets me tweak the rate. It makes things a little more stable, but has no
major impact. On ancient glibcs it might be an issue, but I don't think it
matters too much.
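
The hook into the frame loop is engine-specific so I won't paste it, but the
sleep side of the plugin boils down to something like this (a sketch;
frame_sleep_ns is a made-up name):

    #include <time.h>
    #include <errno.h>

    /* Sleep for ns nanoseconds, restarting if a signal interrupts us. */
    static void frame_sleep_ns(long ns)
    {
        struct timespec req = { ns / 1000000000L, ns % 1000000000L };
        struct timespec rem;

        while (nanosleep(&req, &rem) == -1 && errno == EINTR)
            req = rem;  /* resume with the remaining time */
    }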

- Neph