More and more code depends on knowing the number of processors in the
system in order to scale efficiently.  In OpenMP, for example, it is
used by default to determine how many threads to create.  Creating more
threads than there are processors/cores doesn't make sense.

glibc has long provided functionality to retrieve the number through
sysconf(), and fortunately this is what most programs use.  The problem
is that we currently implement it by parsing /proc/cpuinfo, since that
is all that was available at the time.  Unfortunately, generating
/proc/cpuinfo takes the kernel quite a long time (I think Jakub said it
is mainly the interrupt information).
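
For reference, this is the interface programs are expected to use; a
minimal example (glibc currently answers it by parsing /proc/cpuinfo,
which is exactly the slow part):

  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
    /* Ask glibc for the number of processors currently online.  */
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1)
      n = 1;   /* be conservative if the lookup fails */
    printf("%ld processors online\n", n);
    return 0;
  }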

The alternative today is to read /sys/devices/system/cpu and count the
cpu* directories in it.  This is somewhat faster.  But there is another
possibility: simply stat() /sys/devices/system/cpu and use st_nlink - 2.
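
A rough sketch of the two sysfs-based approaches (not the code used for
the measurements below, just to illustrate the idea):

  #include <dirent.h>
  #include <string.h>
  #include <sys/stat.h>

  /* Count the cpu<N> subdirectories of /sys/devices/system/cpu.  */
  static long count_via_readdir(void)
  {
    DIR *d = opendir("/sys/devices/system/cpu");
    if (d == NULL)
      return -1;
    long n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL)
      /* Match "cpu" followed by a digit, which skips entries like
         "cpufreq" or the sched_* files.  */
      if (strncmp(e->d_name, "cpu", 3) == 0
          && e->d_name[3] >= '0' && e->d_name[3] <= '9')
        ++n;
    closedir(d);
    return n;
  }

  /* The cheap variant: each cpu<N> subdirectory adds one link to the
     parent directory, so st_nlink - 2 gives the count -- provided
     nothing but the cpu<N> directories ever shows up in there.  */
  static long count_via_stat(void)
  {
    struct stat st;
    if (stat("/sys/devices/system/cpu", &st) != 0)
      return -1;
    return (long) st.st_nlink - 2;
  }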

This last approach is unfortunately made impossible by recent changes:

  http://article.gmane.org/gmane.linux.kernel/413178

I would like to propose changing that patch, moving the sched_*
pseudo-files into some other directory, and permanently banning any new
files in /sys/devices/system/cpu.

To get some numbers, you can try

  http://people.redhat.com/drepper/nproc-timing.c

The numbers I see on x86-64:

cpuinfo       10145810 cycles for 100 accesses
readdir /sys   3113870 cycles for 100 accesses
stat /sys       741070 cycles for 100 accesses

Note that for the first two methods I skipped the actual parsing.  This
means that in a real implementation the gap between those two and the
simple stat() call is even bigger.
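
Purely as an illustration of how such a cycle count can be obtained
(this is not the linked program, just a sketch using rdtsc on x86-64):

  #include <stdio.h>
  #include <sys/stat.h>

  static inline unsigned long long rdtsc(void)
  {
    unsigned int lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((unsigned long long) hi << 32) | lo;
  }

  int main(void)
  {
    struct stat st;
    unsigned long long start = rdtsc();
    /* Time 100 stat() calls on the sysfs directory.  */
    for (int i = 0; i < 100; ++i)
      stat("/sys/devices/system/cpu", &st);
    printf("stat /sys %llu cycles for 100 accesses\n", rdtsc() - start);
    return 0;
  }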

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
